Learning Bilingual Sentiment Word Embeddings for Cross-language Sentiment Classification

Authors

  • Huiwei Zhou
  • Long Chen
  • Fulin Shi
  • Degen Huang
Abstract

Sentiment classification performance relies on high-quality sentiment resources. However, these resources are unevenly distributed across languages. Cross-language sentiment classification (CLSC) can leverage the rich resources in one language (the source language) for sentiment classification in a resource-scarce language (the target language). Bilingual embeddings can eliminate the semantic gap between the two languages for CLSC, but they ignore the sentiment information of the text. This paper proposes an approach to learning bilingual sentiment word embeddings (BSWE) for English-Chinese CLSC. The proposed BSWE incorporate the sentiment information of the text into bilingual embeddings. Furthermore, high-quality BSWE can be learned simply from labeled corpora and their translations, without relying on large-scale parallel corpora. Experiments on the NLP&CC 2013 CLSC dataset show that our approach outperforms state-of-the-art systems.
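The core idea in the abstract, learning word vectors for both languages from labeled text plus its translation, can be pictured with a small sketch. This is not the authors' model: it swaps in a plain averaged-embedding logistic classifier, and the toy reviews, the stand-in translations, and the names `PAIRS`, `train_embeddings`, and `predict` are all hypothetical. Each labeled English review is concatenated with its translation, so English and Chinese words share one embedding table and both receive the sentiment signal.

```python
import math
import random

# Hypothetical toy data: labeled English reviews paired with stand-in
# Chinese translations (1 = positive, 0 = negative). Not from the paper.
PAIRS = [
    ("good great phone", "好 很棒 手机", 1),
    ("great good screen", "很棒 好 屏幕", 1),
    ("bad awful phone", "差 糟糕 手机", 0),
    ("awful bad screen", "糟糕 差 屏幕", 0),
]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def doc_vector(tokens, emb, dim):
    # Average the embeddings of the known tokens; None if there are none.
    known = [emb[t] for t in tokens if t in emb]
    if not known:
        return None
    return [sum(v[k] for v in known) / len(known) for k in range(dim)]


def train_embeddings(pairs, dim=4, epochs=300, lr=0.5, seed=0):
    rng = random.Random(seed)
    # A bilingual document is the original tokens plus the translated tokens,
    # so words of both languages share one embedding table and one label.
    docs = [(en.split() + zh.split(), y) for en, zh, y in pairs]
    vocab = sorted({t for tokens, _ in docs for t in tokens})
    emb = {t: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for t in vocab}
    w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    b = 0.0
    for _ in range(epochs):
        for tokens, y in docs:
            doc = doc_vector(tokens, emb, dim)
            logit = sum(wk * dk for wk, dk in zip(w, doc)) + b
            g = sigmoid(logit) - y  # d(log loss)/d(logit)
            for t in tokens:  # the sentiment label updates the word vectors
                for k in range(dim):
                    emb[t][k] -= lr * g * w[k] / len(tokens)
            for k in range(dim):
                w[k] -= lr * g * doc[k]
            b -= lr * g
    return emb, w, b


def predict(tokens, emb, w, b):
    # Probability that the tokens are positive; 0.5 if no token is known.
    doc = doc_vector(tokens, emb, len(w))
    if doc is None:
        return 0.5
    return sigmoid(sum(wk * dk for wk, dk in zip(w, doc)) + b)


emb, w, b = train_embeddings(PAIRS)
print(predict("很棒 好".split(), emb, w, b))  # Chinese-only input
print(predict("糟糕 差".split(), emb, w, b))
```

In this sketch no large parallel corpus is involved: the only bilingual signal comes from pairing each labeled review with its translation, which mirrors the setting the abstract describes. A real system would use machine-translated corpora and a stronger embedding model.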

Similar resources

Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning

Cross-lingual sentiment classification aims to adapt the sentiment resource in a resource-rich language to a resource-poor language. In this study, we propose a representation learning approach which simultaneously learns vector representations for the texts in both the source and the target languages. Different from previous research which only gets bilingual word embedding, our Bilingual Docu...

Cross-lingual Sentiment Lexicon Learning With Bilingual Word Graph Label Propagation

In this article, we address the task of cross-lingual sentiment lexicon learning, which aims to automatically generate sentiment lexicons for the target languages from available English sentiment lexicons. We formalize the task as a learning problem on a bilingual word graph, in which the intra-language relations among the words in the same language and the inter-language relations among the word...

Attention-based LSTM Network for Cross-Lingual Sentiment Classification

Most of the state-of-the-art sentiment classification methods are based on supervised learning algorithms which require large amounts of manually labeled data. However, the labeled resources are usually imbalanced in different languages. Cross-lingual sentiment classification tackles the problem by adapting the sentiment resources in a resource-rich language to resource-poor languages. In this ...

Improving the Accuracy of Pre-trained Word Embeddings for Sentiment Analysis

Sentiment analysis is a well-known task and a fast-growing research area in natural language processing (NLP) and text classification. This technique has become an essential part of a wide range of applications, including politics, business, advertising and marketing. There are various techniques for sentiment analysis, but recently word embedding methods have been widely used in sent...

Exploring Distributional Representations and Machine Translation for Aspect-based Cross-lingual Sentiment Classification

Cross-lingual sentiment classification (CLSC) seeks to use resources from a source language in order to detect sentiment and classify text in a target language. Almost all research into CLSC has been carried out at sentence and document level, although this level of granularity is often less useful. This paper explores methods for performing aspect-based cross-lingual sentiment classification (...

Publication date: 2015